Weight and Gradient Centralization in Deep Neural Networks

Authors

Abstract

Batch normalization is currently the most widely used variant of internal normalization for deep neural networks. Additional work has shown that the normalization of weights and additional conditioning, as well as the normalization of gradients, further improve generalization. In this work, we combine several of these methods and thereby increase the generalization of the networks. The advantage of the newer methods compared to batch normalization is not only increased generalization, but also that they only have to be applied during training and therefore do not influence the running time during use. https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FWeightAndGradientCentralization&mode=list.
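To make the two operations in the title concrete, the following is a minimal sketch assuming a PyTorch-style training loop: gradients are centralized (the mean over each output neuron's input dimensions is subtracted) before the optimizer update, and the weights are re-centralized afterwards. The names `centralize` and `CentralizedSGD` are illustrative; this is not the authors' released implementation (linked above), and it only shows why the extra cost is confined to training.

```python
import torch

def centralize(t: torch.Tensor) -> torch.Tensor:
    """Subtract, per output neuron, the mean over all remaining dimensions."""
    if t.dim() > 1:
        return t - t.mean(dim=tuple(range(1, t.dim())), keepdim=True)
    return t  # leave biases and other vectors untouched

class CentralizedSGD(torch.optim.SGD):
    """SGD with gradient centralization before the update and weight
    centralization after it; nothing extra runs at inference time."""

    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None:
                    p.grad.copy_(centralize(p.grad))  # gradient centralization
        loss = super().step(closure)
        for group in self.param_groups:
            for p in group["params"]:
                if p.dim() > 1:
                    p.copy_(centralize(p))            # weight centralization
        return loss
```

A drop-in usage would be `opt = CentralizedSGD(model.parameters(), lr=0.1, momentum=0.9)` inside the usual `loss.backward(); opt.step()` loop.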


Similar articles

Gradient conjugate priors and deep neural networks

The paper deals with learning the probability distribution of the observed data by artificial neural networks. We suggest a so-called gradient conjugate prior (GCP) update appropriate for neural networks, which is a modification of the classical Bayesian update for conjugate priors. We establish a connection between the gradient conjugate prior update and the maximization of the log-likelihood ...
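For orientation, the "classical Bayesian update for conjugate priors" that the GCP rule modifies can be sketched for a Gaussian likelihood with unknown mean and precision under a Normal-Gamma prior. This is the textbook closed-form update, not the paper's gradient-based GCP variant; the function name is illustrative.

```python
def normal_gamma_update(mu0, lam0, alpha0, beta0, x):
    """Posterior Normal-Gamma parameters after observing x ~ N(mu, 1/tau),
    where (mu, tau) has prior NormalGamma(mu0, lam0, alpha0, beta0)."""
    mu_n = (lam0 * mu0 + x) / (lam0 + 1.0)
    lam_n = lam0 + 1.0
    alpha_n = alpha0 + 0.5
    beta_n = beta0 + lam0 * (x - mu0) ** 2 / (2.0 * (lam0 + 1.0))
    return mu_n, lam_n, alpha_n, beta_n
```

Per the abstract above, the GCP update replaces such closed-form increments with a gradient-based rule suited to neural network training.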


On weight initialization in deep neural networks

A proper initialization of the weights in a neural network is critical to its convergence. Current insights into weight initialization come primarily from linear activation functions. In this paper, I develop a theory for weight initializations with non-linear activations. First, I derive a general weight initialization strategy for any neural network using activation functions differentiable a...
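As a concrete illustration of variance-scaling initialization in this spirit (not the specific rule derived in the paper, which covers general differentiable activations), a minimal NumPy sketch:

```python
import numpy as np

def scaled_init(fan_in: int, fan_out: int, gain: float = np.sqrt(2.0),
                rng: np.random.Generator | None = None) -> np.ndarray:
    """Draw weights with variance gain**2 / fan_in so that pre-activation
    variance is roughly preserved from layer to layer; gain = sqrt(2) is
    the common choice for ReLU activations."""
    rng = rng or np.random.default_rng()
    return rng.normal(0.0, gain / np.sqrt(fan_in), size=(fan_out, fan_in))
```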


Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks

Effective training of deep neural networks suffers from two main issues. The first is that the parameter spaces of these models exhibit pathological curvature. Recent methods address this problem by using adaptive preconditioning for Stochastic Gradient Descent (SGD). These methods improve convergence by adapting to the local geometry of parameter space. A second issue is overfitting, which is ...
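A minimal sketch of one preconditioned SGLD step, assuming the common RMSProp-style diagonal preconditioner and omitting the small curvature-correction term; class and parameter names are illustrative and this is not the paper's reference code.

```python
import torch

class PSGLD:
    """RMSProp-preconditioned SGLD: a preconditioned gradient step plus
    Gaussian noise whose scale matches the step size and preconditioner."""

    def __init__(self, params, lr=1e-3, alpha=0.99, eps=1e-5):
        self.params = list(params)
        self.lr, self.alpha, self.eps = lr, alpha, eps
        self.v = [torch.zeros_like(p) for p in self.params]  # running 2nd moments

    @torch.no_grad()
    def step(self):
        for p, v in zip(self.params, self.v):
            if p.grad is None:
                continue
            v.mul_(self.alpha).addcmul_(p.grad, p.grad, value=1 - self.alpha)
            g = 1.0 / (v.sqrt() + self.eps)                    # diagonal preconditioner
            noise = torch.randn_like(p) * torch.sqrt(self.lr * g)
            p.add_(-0.5 * self.lr * g * p.grad + noise)        # Langevin update
```

Here `p.grad` is assumed to hold a stochastic estimate of the gradient of the negative log posterior.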


Projection Based Weight Normalization for Deep Neural Networks

Optimizing deep neural networks (DNNs) often suffers from the ill-conditioned problem. We observe that the scaling-based weight space symmetry property in rectified nonlinear networks will cause this negative effect. Therefore, we propose to constrain the incoming weights of each neuron to be unit-norm, which is formulated as an optimization problem over the Oblique manifold. A simple yet efficient ...
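The unit-norm constraint described here can be sketched as a projection (a retraction onto the Oblique manifold) applied after each ordinary optimizer step. The helper below is illustrative, assumes PyTorch, and is not the paper's reference implementation.

```python
import torch

@torch.no_grad()
def project_unit_norm(weight: torch.Tensor, eps: float = 1e-12) -> None:
    """Rescale each neuron's incoming weight vector (each row, with conv
    kernels flattened) to unit L2 norm, in place."""
    flat = weight.view(weight.size(0), -1)
    flat.div_(flat.norm(dim=1, keepdim=True).clamp_min(eps))

# After every optimizer.step():
#   for m in model.modules():
#       if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d)):
#           project_unit_norm(m.weight)
```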


Weight Initialization of Deep Neural Networks (DNNs) using Data Statistics

Deep neural networks (DNNs) form the backbone of almost every state-of-the-art technique in fields such as computer vision, speech processing, and text analysis. The recent advances in computational technology have made the use of DNNs more practical. Despite the overwhelming performance of DNNs and the advances in computational technology, it is seen that very few researchers try to train t...



Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-86380-7_19